"IDS a Wrap!"¶

A Culminating Workshop for the Course Introduction to Data Science



Objectives:¶

  • To reinforce and apply the stages of the data science pipeline—from data acquisition to analysis—using a real-time and semi-structured dataset.
  • To gain hands-on experience in collecting data from APIs using Python’s requests library, including authentication, handling JSON responses, and managing real-world API constraints.
[Figure: the data science pipeline, with the four stages Acquire, Prepare, Analyze, and Act]

This image illustrates a familiar reality we've already discussed: the practical workflow of data science. It highlights four broad stages—Acquire, Prepare, Analyze, and Act—and shows how the degree of effort typically peaks during the analysis phase, where we encounter the most trial-and-error through cycles of setting up, trying, and evaluating different approaches.

The pipeline provides the methodological structure to manage and extract value from data that is large, fast, diverse, uncertain, and complex.

However, as we’ve also acknowledged in our discussions, the data science pipeline is not as simple or linear as these four steps suggest. In practice, each stage contains a range of more granular and specific methods—from data cleaning, transformation, feature engineering, and model tuning, to interpreting results and communicating insights. The process is iterative, often messy, and highly dependent on the context and constraints of each project. This diagram serves as a helpful high-level summary, but the real work lies in the complexities within and between these stages.

Workshop Overview¶

In this culminating workshop, we’ll put that understanding into action. The objective is to gather real-world data from a live data source, particularly using an API, and walk it through the full data science pipeline. What makes this workshop especially relevant to today’s tools and trends is our integration of AI-powered assistants like ChatGPT to support decision-making, troubleshoot challenges, and streamline various tasks throughout the process.

From forming the right API queries to shaping the data for analysis and deriving meaningful insights, this activity highlights not just technical execution—but how AI can enhance human judgment and improve the overall flow of a data science project.

For this workshop, we’ll be featuring the Clash of Clans API as our primary data source. This API offers an excellent opportunity to experience real-world data work for several key reasons:

  1. It delivers real-time data, making it ideal for understanding concepts related to data velocity and live system integration.
  2. The data is not perfectly clean or structured, reflecting the messiness of real-world datasets, where inconsistencies, missing values, and nested structures are common.
  3. It presents a semi-structured format (JSON), challenging participants to parse and transform data before it can be used for analysis—mirroring typical preparation and wrangling tasks.
  4. The data is dynamic and contextual, requiring logical reasoning to extract insights (e.g., understanding player stats, clan activity, and performance metrics).
  5. Accessing the API involves authentication and rate limits, simulating real-world constraints when working with third-party data services.

Ask yourselves:

  1. How will we collect the data?
  2. What challenges might we face when cleaning or organizing this data?
  3. How can we turn raw game data into valuable insights?
  4. What questions can we ask or answer with this data?

Basic Concepts¶

To get the most out of this workshop, it's important to have a basic grasp of the following concepts, especially since we’ll be working directly with a real-world API:

  1. API (Application Programming Interface): An API is a set of rules that allows different software systems to communicate with each other over the internet or within applications.

  2. Endpoint: An endpoint is a specific URL path within an API that performs a particular function, such as retrieving or modifying data.

  3. Base URL: The base URL is the root address of the API, to which endpoints are appended to make requests (e.g., https://api.example.com).

  4. Request: A request is a message sent by a client to the API server asking for data or triggering an action.

  5. Request Methods: These indicate the type of action to perform, such as GET (retrieve data), POST (send data), PUT (update data), or DELETE (remove data).

  6. Headers: Headers are key-value pairs in an API request that provide additional information, such as content type or authentication credentials.

  7. Authentication: Authentication is the process of verifying the identity of the client making the API request to ensure secure access.

  8. API Token: An API token is a unique key used in the headers or query parameters to authenticate and authorize API requests.

  9. Status Code: A status code is a number returned by the API that indicates the result of the request (e.g., 200 for success, 404 for not found, 401 for unauthorized).

  10. JSON (JavaScript Object Notation): JSON is a lightweight data format used to structure and exchange data in API requests and responses.

  11. Query Parameters: Query parameters are key-value pairs added to the end of a URL to filter or modify the data returned by the API (e.g., ?search=books&limit=10).
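
As a small illustration of the status codes in item 9, Python's standard library already knows the standard phrases for each code. The helper name `describe_status` is hypothetical and used only for this sketch; individual APIs may attach their own, more specific meanings to the same codes.

```python
from http import HTTPStatus

def describe_status(code):
    """Return a short label such as '404 Not Found' for a standard HTTP status code."""
    try:
        return f"{code} {HTTPStatus(code).phrase}"
    except ValueError:
        # HTTPStatus raises ValueError for codes it does not define
        return f"{code} (non-standard status code)"

for code in (200, 401, 404, 429):
    print(describe_status(code))
```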

Things to Consider¶

When sending a GET request to an API using Python's requests library, there are several important elements to consider to ensure that the request is successful and returns the expected data. First and foremost, every API is unique, so it’s essential to consult the API documentation before writing any code. The documentation will describe the available endpoints, required authentication methods, query parameters, rate limits, and the structure of both requests and responses. For example, some APIs may require you to send an API key in the headers, while others may use OAuth.

A typical GET request starts with the base URL and an endpoint, which together form the full request URL. For instance, in the URL https://api.example.com/v1/users, https://api.example.com is the base URL and /v1/users is the endpoint. Many APIs also accept query parameters, which allow you to filter or paginate the data. These are added at the end of the URL in key-value pairs, such as ?limit=10&sort=name.
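
To make the composition concrete, here is a minimal sketch that assembles a full request URL using only the standard library. The host `api.example.com` is the placeholder from the paragraph above, and `build_url` is a hypothetical helper written for this illustration. When you pass `params=` to `requests.get`, the library does a comparable encoding for you.

```python
from urllib.parse import urlencode

def build_url(base_url, endpoint, params=None):
    """Join a base URL, an endpoint path, and optional query parameters."""
    # Normalise the slash between base and endpoint so it appears exactly once
    url = base_url.rstrip("/") + "/" + endpoint.lstrip("/")
    if params:
        url += "?" + urlencode(params)  # e.g. {"limit": 10} becomes "limit=10"
    return url

full_url = build_url("https://api.example.com", "/v1/users", {"limit": 10, "sort": "name"})
print(full_url)
```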

Another crucial aspect is authentication. Most APIs require an API token or key to access their services. This token is often included in the headers of the request. For example, you might need to include a header like Authorization: Bearer <your_token_here> or x-api-key: <your_key_here>. Without proper authentication, your request will likely return a 401 Unauthorized status code.

Example Implementation

import requests

# API details from the documentation
base_url = "https://api.example.com"
endpoint = "/v1/users"
url = f"{base_url}{endpoint}"

# Query parameters to customize the request
params = {
    "limit": 5,
    "sort": "name"
}

# Headers including authentication token
headers = {
    "Authorization": "Bearer your_api_token_here"
}

# Send the GET request
response = requests.get(url, headers=headers, params=params)

# Handle the response
if response.status_code == 200:
    data = response.json()
    print("User Data:", data)
else:
    print("Request failed with status code:", response.status_code)

In this example, the params dictionary holds query parameters to limit and sort the data, and the headers dictionary includes the API token for authentication. Finally, always check the status code of the response to determine if the request was successful (e.g., 200 OK) or if there was an issue (e.g., 404 Not Found, 401 Unauthorized, etc.). Properly reading and handling the response — typically in JSON format — allows your program to function reliably across different APIs.

Now that we have discussed the objectives of this workshop, along with the key components and important considerations, we are ready to demonstrate how to make API requests using a real-world data source from the Clash of Clans API.

Groupings¶

Please form 6 groups, with no more than 5 members in each group.

To ensure a smoother and more engaging experience during the workshop activities, try to include at least one member who is familiar with the Clash of Clans game in each group. This familiarity will help in understanding the context of the API data more effectively, especially during the hands-on demonstration and group tasks.

If no one in your group is familiar with the game, you may refer to the following Wikipedia page for a quick overview of the Clash of Clans gameplay: https://en.wikipedia.org/wiki/Clash_of_Clans

Implementation¶

Required Libraries

  • load_dotenv (from the python-dotenv package) – Loads authorization tokens from a .env file into environment variables, which keeps sensitive credentials out of your code.
  • plotly.express – A high-level plotting library used to create interactive and visually appealing charts for exploring various dimensions of the data.
In [128]:
import requests                # for communicating with the API framework
import json
import pandas as pd            # structuring responses
from dotenv import load_dotenv # handling sensitive keys (best practice)
import os                      # extract local keys

import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

Before you can start collecting data, you’ll need to register and gain access to the Clash of Clans API. This involves creating a developer account and generating an API token that will authenticate your requests. Visit the following link and register an account.

https://developer.clashofclans.com/#/

Generating Your API Key

Once you've successfully registered for a Clash of Clans developer account:

  1. Navigate to My Account > Create New Key.
  2. You’ll be asked to provide an Allowed IP Address. This is required to ensure that only your machine can use the key. To find your current IP address for local development, visit: https://whatismyipaddress.com/
  3. Look for your IPv4 Address, copy it, and paste it into the Allowed IP field on the key creation page.
  4. After submitting, your API key will be generated—copy and store it securely. You’ll use it in your request headers to access the API.

Note: If you change networks (e.g., from home to school Wi-Fi), you’ll need to update your allowed IP address accordingly.

To protect your API key and avoid accidentally sharing it (e.g., when uploading code to GitHub), it's good practice to store sensitive information in a .env file.

  1. Install the python-dotenv library (if not already installed):

    pip install python-dotenv
    
  2. Create a .env file in the same directory as your Jupyter notebook or Python script.

  3. Inside the .env file, store your API key like this:

    API_TOKEN=your_actual_api_key_here
    
In [2]:
load_dotenv() # we need to run this code to parse the .env file where the token is stored.
Out[2]:
True
In [7]:
API_TOKEN = os.getenv('API_TOKEN')  # must match the variable name used in the .env file
BASE_URL = "https://api.clashofclans.com/v1/"
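
Because `os.getenv` quietly returns `None` when a variable is missing, a misnamed key in the .env file only shows up later as a failed request. The sketch below fails early instead. `require_env` is a hypothetical helper, not part of python-dotenv, and it assumes `load_dotenv()` has already been called.

```python
import os

def require_env(name):
    """Return the value of an environment variable, or raise if it is missing or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; check the .env file and load_dotenv()."
        )
    return value

# Usage (assumes .env defines API_TOKEN as shown earlier):
# API_TOKEN = require_env("API_TOKEN")
```
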

Read documentation

In [8]:
HEADERS = {
    'Authorization': f'Bearer {API_TOKEN}'
}

An authenticated API query is typically built from the following components:

  1. Base URL – The root address of the API.
  2. Endpoint – Specifies the exact resource to access.
  3. Headers – Contains metadata, including content type and authorization details (such as an API key or token).
  4. Query Parameters (if applicable) – Allows customization of the request.

At this point, we have already defined the following components:

  • ✅ Base URL
  • ✅ Headers (With the authorization)

Clash of Clans API Endpoints¶

Imagine you're visiting a large library. The base URL is the library’s main entrance, and the endpoints are the individual rooms or sections within the library, each containing different resources. Just as you need to specify which room to go to in order to find a specific book, you need to specify the endpoint to access the right resource in an API.

Read documentation

For example, the Clash of Clans API (COC API) offers multiple endpoints such as /clans, /players, and /leagues, each providing different sets of data related to those resources. According to the documentation, the /clans endpoint requires at least one filtering parameter. Here we will use locationId, which restricts the results to clans registered in a specific location.

However, there's one predicament: we don’t yet know what the locationId is. Fortunately, the COC API also offers a /locations endpoint, which we can use to query a list of available locations. Once we retrieve this information, we can easily find the locationId needed to query the /clans endpoint. This makes the process more manageable and ensures we can successfully make the request to retrieve clan data.

In [10]:
# BASE_URL already ends with '/', so the endpoint name can be appended directly.
# (os.path.join is meant for filesystem paths and can insert backslashes on Windows.)
LOCATION_ENDPOINT = f"{BASE_URL}locations"

At this point, we have already defined the following components:

  • ✅ Base URL
  • ✅ Headers (With the authorization)
  • ✅ Endpoint (location endpoint)

The /locations endpoint doesn’t require any query parameters, so we can make the request without specifying additional inputs.

Read documentation

1st API Request: Location (No Parameter)¶

In [12]:
location_response = requests.get(url=LOCATION_ENDPOINT, headers=HEADERS)

The request returns a response object that contains important methods and attributes we need to consider:

  • .json() – a method that parses the JSON response body and returns it as Python objects (typically a dictionary or a list).
  • .reason – an attribute that provides a textual explanation of the status code (e.g., "OK" for a successful request).
In [18]:
location_response
Out[18]:
<Response [200]>
Status Code  Message
200          Successful response. Returns a LocationList model with location data.
400          Client provided incorrect parameters for the request.
403          Access denied due to missing/incorrect credentials or insufficient API token permissions.
404          Resource was not found.
429          Request was throttled due to exceeding the allowed rate limit for the API token.
500          An unknown error occurred while handling the request.
503          Service is temporarily unavailable due to maintenance.
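
Two of the codes in the table, 429 and 503, describe temporary conditions, so a common response is to wait and try again. The following is a minimal sketch of that idea with exponential backoff, not a method prescribed by the API. `get_with_retry`, its parameters, and the delay values are assumptions chosen for illustration.

```python
import time

RETRYABLE = {429, 503}  # throttled / temporarily unavailable, per the table above

def get_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send  -- zero-argument callable returning an object with a .status_code attribute,
             e.g. lambda: requests.get(url, headers=HEADERS, timeout=10)
    sleep -- injectable so the function can be exercised without real waiting
    """
    for attempt in range(max_attempts):
        response = send()
        if response.status_code not in RETRYABLE:
            return response
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # exponential backoff: 1s, 2s, 4s, ...
    return response  # still throttled or unavailable after the last attempt
```

Other codes such as 400, 403, and 404 are returned immediately, since repeating the same request would not change the outcome.
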
In [20]:
loc_data = location_response.json()

loc_data is a dictionary, so it contains keys. We can use the .keys() method to see which keys are available to explore.

In [21]:
loc_data.keys()
Out[21]:
dict_keys(['items', 'paging'])
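
The 'paging' key suggests that larger result sets may be split across several responses. The sketch below shows one way to walk through such pages. It assumes a cursor layout of the form `{"paging": {"cursors": {"after": "<token>"}}}` and request parameters named `limit` and `after`; both should be confirmed against the API documentation before relying on them. `iter_items` and `fetch_page` are hypothetical names.

```python
def iter_items(fetch_page, limit=100):
    """Yield every item across the pages of a cursor-paginated endpoint.

    fetch_page -- callable taking a params dict and returning the parsed JSON body,
                  e.g. lambda p: requests.get(url, headers=HEADERS, params=p).json()
    """
    params = {"limit": limit}
    while True:
        body = fetch_page(params)
        yield from body.get("items", [])
        # Assumed layout: {"paging": {"cursors": {"after": "<token>"}}}
        after = body.get("paging", {}).get("cursors", {}).get("after")
        if not after:
            break  # no further pages
        params = {"limit": limit, "after": after}
```
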
In [27]:
countries = pd.DataFrame([i for i in loc_data['items'] if i['isCountry']])
other_region = pd.DataFrame([i for i in loc_data['items'] if not i['isCountry']])
In [28]:
countries.head()
Out[28]:
id name isCountry countryCode
0 32000008 Åland Islands True AX
1 32000009 Albania True AL
2 32000010 Algeria True DZ
3 32000011 American Samoa True AS
4 32000012 Andorra True AD
In [29]:
other_region
Out[29]:
id name isCountry
0 32000000 Europe False
1 32000001 North America False
2 32000002 South America False
3 32000003 Asia False
4 32000004 Australia False
5 32000005 Africa False
6 32000006 International False
7 32000007 Afghanistan False
8 32000261 False
9 32000262 False
10 32000263 False
11 32000264 False
12 32000265 False

The id field from the /locations endpoint serves as the parameter required to access the list of clans for a specific location. For example, if we want to query clans based in the Philippines, we’ll need to identify the corresponding locationId by filtering the list of locations accordingly.

In [32]:
countries[countries['name'] == 'Philippines']
Out[32]:
id name isCountry countryCode
177 32000185 Philippines True PH
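
The same lookup can be done on the raw list of location dictionaries, which is handy when a group wants to switch countries without building a DataFrame first. This is a minimal sketch; `find_location_id` is a hypothetical helper, and the sample rows are copied from the outputs above.

```python
def find_location_id(items, name):
    """Return the id of the country whose name matches (case-insensitive), else None."""
    wanted = name.strip().lower()
    for item in items:
        if item.get("isCountry") and item.get("name", "").lower() == wanted:
            return item["id"]
    return None

# Sample rows copied from the outputs above
sample_items = [
    {"id": 32000000, "name": "Europe", "isCountry": False},
    {"id": 32000009, "name": "Albania", "isCountry": True, "countryCode": "AL"},
    {"id": 32000185, "name": "Philippines", "isCountry": True, "countryCode": "PH"},
]
print(find_location_id(sample_items, "philippines"))
```

With the real response, the call would be `find_location_id(loc_data['items'], 'Philippines')`.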

With the locationId obtained, we are now equipped to query the list of clans by passing it as a parameter to the appropriate endpoint.

2nd API Request: Clans (Location as Parameter)¶

Read documentation

In [34]:
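clan_params = {
    'locationId': 32000185  # Philippines, from the /locations lookup above
}

# BASE_URL already ends with '/', so the endpoint name can be appended directly
CLAN_ENDPOINT = f"{BASE_URL}clans"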
In [35]:
clan_response = requests.get(url=CLAN_ENDPOINT, params=clan_params, headers=HEADERS)
In [36]:
clan_response
Out[36]:
<Response [200]>
In [41]:
clan_data = clan_response.json()['items']
In [43]:
ph_clans = pd.DataFrame(clan_data)
ph_clans.head()
Out[43]:
tag name type location isFamilyFriendly badgeUrls clanLevel clanPoints clanBuilderBasePoints clanCapitalPoints ... warWins warTies warLosses isWarLogPublic warLeague members labels requiredBuilderBaseTrophies requiredTownhallLevel chatLanguage
0 #2R0JVJJ80 #ARVALZ_3 open {'id': 32000185, 'name': 'Philippines', 'isCou... True {'small': 'https://api-assets.clashofclans.com... 3 13179 10940 0 ... 6 1.0 14.0 True {'id': 48000005, 'name': 'Silver League II'} 15 [{'id': 56000001, 'name': 'Clan War League', '... 0 8 {'id': 75000000, 'name': 'English', 'languageC...
1 #2RGRJ80CJ &re vs theworld closed {'id': 32000185, 'name': 'Philippines', 'isCou... False {'small': 'https://api-assets.clashofclans.com... 1 9568 9181 0 ... 1 NaN NaN False {'id': 48000000, 'name': 'Unranked'} 14 [{'id': 56000008, 'name': 'Farming', 'iconUrls... 200 5 {'id': 75000000, 'name': 'English', 'languageC...
2 #2JRGLRP00 )Pinoy only( open {'id': 32000185, 'name': 'Philippines', 'isCou... False {'small': 'https://api-assets.clashofclans.com... 2 8140 7294 0 ... 7 4.0 3.0 True {'id': 48000000, 'name': 'Unranked'} 11 [{'id': 56000000, 'name': 'Clan Wars', 'iconUr... 200 1 {'id': 75000000, 'name': 'English', 'languageC...
3 #2JPLRRUCQ *@NEWBEEZ@* inviteOnly {'id': 32000185, 'name': 'Philippines', 'isCou... False {'small': 'https://api-assets.clashofclans.com... 4 18043 13822 0 ... 9 NaN NaN False {'id': 48000009, 'name': 'Gold League I'} 15 [{'id': 56000000, 'name': 'Clan Wars', 'iconUr... 0 13 {'id': 75000021, 'name': 'Other', 'languageCod...
4 #2R2LYU02Y *Tron Awaken * open {'id': 32000185, 'name': 'Philippines', 'isCou... True {'small': 'https://api-assets.clashofclans.com... 2 5886 3995 0 ... 5 4.0 5.0 True {'id': 48000000, 'name': 'Unranked'} 13 [] 0 4 {'id': 75000000, 'name': 'English', 'languageC...

5 rows × 24 columns
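
Several columns above (location, warLeague, chatLanguage) hold nested dictionaries, which is the semi-structured shape mentioned in the overview. A minimal sketch of flattening one record with plain Python follows; `flatten` is a hypothetical helper, and the record is an abbreviated version of the first row above.

```python
def flatten(record, parent_key="", sep="_"):
    """Flatten nested dicts into a single-level dict; lists are left untouched."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))  # recurse into nested dicts
        else:
            flat[full_key] = value
    return flat

# Abbreviated record modelled on the first row above
clan = {
    "tag": "#2R0JVJJ80",
    "location": {"id": 32000185, "name": "Philippines"},
    "warLeague": {"id": 48000005, "name": "Silver League II"},
    "labels": [{"id": 56000001, "name": "Clan War League"}],
}
print(flatten(clan))
```

pandas offers a comparable built-in, `pd.json_normalize(clan_data, sep="_")`, which applies the same idea to the whole list of records at once.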

Dataset Metadata and Description¶

In [46]:
ph_clans.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 832 entries, 0 to 831
Data columns (total 24 columns):
 #   Column                       Non-Null Count  Dtype  
---  ------                       --------------  -----  
 0   tag                          832 non-null    object 
 1   name                         832 non-null    object 
 2   type                         832 non-null    object 
 3   location                     832 non-null    object 
 4   isFamilyFriendly             832 non-null    bool   
 5   badgeUrls                    832 non-null    object 
 6   clanLevel                    832 non-null    int64  
 7   clanPoints                   832 non-null    int64  
 8   clanBuilderBasePoints        832 non-null    int64  
 9   clanCapitalPoints            832 non-null    int64  
 10  capitalLeague                832 non-null    object 
 11  requiredTrophies             832 non-null    int64  
 12  warFrequency                 832 non-null    object 
 13  warWinStreak                 832 non-null    int64  
 14  warWins                      832 non-null    int64  
 15  warTies                      494 non-null    float64
 16  warLosses                    494 non-null    float64
 17  isWarLogPublic               832 non-null    bool   
 18  warLeague                    832 non-null    object 
 19  members                      832 non-null    int64  
 20  labels                       832 non-null    object 
 21  requiredBuilderBaseTrophies  832 non-null    int64  
 22  requiredTownhallLevel        832 non-null    int64  
 23  chatLanguage                 620 non-null    object 
dtypes: bool(2), float64(2), int64(10), object(10)
memory usage: 144.8+ KB
In [48]:
ph_clans.describe().T
Out[48]:
count mean std min 25% 50% 75% max
clanLevel 832.0 7.908654 5.696396 1.0 3.00 7.0 11.00 30.0
clanPoints 832.0 12185.926683 6001.953419 970.0 7924.50 11072.5 15204.00 38472.0
clanBuilderBasePoints 832.0 10710.981971 6043.159458 20.0 6061.75 9788.5 13888.50 37802.0
clanCapitalPoints 832.0 0.000000 0.000000 0.0 0.00 0.0 0.00 0.0
requiredTrophies 832.0 900.961538 1460.980347 0.0 0.00 200.0 1125.00 5500.0
warWinStreak 832.0 2.159856 8.079906 0.0 0.00 1.0 2.00 152.0
warWins 832.0 109.834135 149.995183 0.0 12.00 45.5 152.00 992.0
warTies 494.0 4.977733 6.676949 0.0 1.00 3.0 6.75 64.0
warLosses 494.0 81.040486 121.327655 0.0 9.00 33.0 97.00 957.0
members 832.0 13.885817 8.168513 5.0 8.00 12.0 17.00 50.0
requiredBuilderBaseTrophies 832.0 639.543269 1341.079646 0.0 0.00 0.0 800.00 5500.0
requiredTownhallLevel 832.0 5.569712 4.664684 1.0 1.00 5.0 9.00 17.0
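
The info() output above shows warTies and warLosses populated for only 494 of 832 clans, and in the sample rows the missing values coincide with isWarLogPublic being False. Any derived metric therefore has to handle incomplete records. The sketch below computes a war win rate from a raw clan dictionary (an element of clan_data), assuming the missing fields are simply absent from the JSON. `war_win_rate` is a hypothetical helper.

```python
def war_win_rate(clan):
    """Return wins / (wins + ties + losses), or None when the record is incomplete."""
    wins = clan.get("warWins")
    ties = clan.get("warTies")
    losses = clan.get("warLosses")
    if wins is None or ties is None or losses is None:
        return None  # e.g. a clan whose war log is not public
    total = wins + ties + losses
    return wins / total if total else None  # avoid dividing by zero

# Values taken from the first two sample rows above
print(war_win_rate({"warWins": 6, "warTies": 1, "warLosses": 14}))
print(war_win_rate({"warWins": 1}))
```

Returning None instead of 0 keeps clans with hidden war logs from being counted as clans that never win.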

Data Analysis¶

Now that we’ve successfully queried the list of clans in the Philippines, we can proceed with analyzing the response data. One meaningful direction is to explore how clans are distributed based on their labels, which represent the characteristics or focus areas of each clan.

According to the Clash of Clans Wiki, common clan labels include:

  • Clan Wars
  • Clan War League
  • Trophy Pushing
  • Friendly Wars
  • Clan Games
  • Builder Base
  • Base Designing
  • International
  • Farming
  • Donations
  • Clan Capital
  • Friendly
  • Talkative
  • Underdog
  • Relaxed
  • Competitive
  • Newbie Friendly

By examining the distribution of these labels in our dataset, we can gain insights into what types of clans are most common in the Philippines.

In [88]:
labels = {}

for label_list in ph_clans['labels']:
    for label in label_list:
        label_name = label['name']
        if label_name in labels:
            labels[label_name] += 1
        else:
            labels[label_name] = 1
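
The same tally can be written more compactly with `collections.Counter` from the standard library. This is an equivalent sketch, not a replacement for the cell above; `count_labels` is a hypothetical helper, and the sample uses label names and ids that appear in the rows shown earlier.

```python
from collections import Counter

def count_labels(label_lists):
    """Count how often each label name appears across an iterable of label lists."""
    return Counter(label["name"] for labels in label_lists for label in labels)

sample = [
    [{"id": 56000001, "name": "Clan War League"}],
    [{"id": 56000008, "name": "Farming"}, {"id": 56000000, "name": "Clan Wars"}],
    [{"id": 56000000, "name": "Clan Wars"}],
    [],  # clans without labels contribute nothing
]
counts = count_labels(sample)
print(counts.most_common(2))
```

With the real data, `count_labels(ph_clans['labels'])` would return a Counter that can be passed to `dict()` wherever a plain dictionary is expected.
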
In [93]:
clan_labels = pd.DataFrame(list(labels.items()), columns=['Label', 'Count'])
clan_labels = clan_labels.sort_values(by='Count', ascending=False)

initial_df = clan_labels.head(5)

fig = go.Figure()

fig.add_trace(go.Pie(
    labels=initial_df['Label'],
    values=initial_df['Count'],
    hoverinfo='label+percent',
    textinfo='label+value'
))

dropdown_options = [
    {"label": f"Top {i}", "method": "restyle",
     "args": [{"labels": [clan_labels['Label'].head(i)],
               "values": [clan_labels['Count'].head(i)]}]}
    for i in range(3, len(clan_labels) + 1)
]

fig.update_layout(
    title="Distribution of Clan Labels",
    updatemenus=[{
        "buttons": dropdown_options,
        "direction": "down",
        "showactive": True,
        "x": 1.1,
        "y": 0.5,
        "xanchor": "left"
    }]
)

fig.show()

Are there noticeable patterns or significant differences in clan points based on the different labels assigned to clans?

In [121]:
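# Collect the clanPoints of every clan that carries each label
clan_label_x_points = {label: [] for label in labels}

for clan_label in clan_label_x_points:
    # Boolean mask: True for clans whose label list contains clan_label
    has_label = ph_clans['labels'].apply(lambda x: any(i['name'] == clan_label for i in x))
    clan_label_x_points[clan_label].append(ph_clans.loc[has_label, 'clanPoints'])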
In [129]:
for label in clan_label_x_points:
    clan_label_x_points[label] = pd.concat(clan_label_x_points[label])

# Setup subplot grid (e.g., 5 rows x 4 columns = 20 slots, 17 used)
rows = 5
cols = 4
fig = make_subplots(rows=rows, cols=cols, subplot_titles=list(clan_label_x_points.keys()))

row, col = 1, 1
for i, (label, points_series) in enumerate(clan_label_x_points.items()):
    fig.add_trace(
        go.Histogram(x=points_series, name=label, nbinsx=20),
        row=row, col=col
    )
    col += 1
    if col > cols:
        col = 1
        row += 1

# Final layout adjustments
fig.update_layout(
    height=1000, width=1200,
    title_text="Distribution of Clan Points by Label",
    showlegend=False
)

fig.show()

Labels with Higher Clan Points:

  • Clan War League and Clan Wars show a strong concentration of clans with higher clan points, often peaking between 10k–30k, with some reaching even higher.
  • Competitive clans also show a wider and more evenly spread distribution, suggesting a more active and higher-performing player base.

Labels with Lower or Mid-range Points:

  • Newbie Friendly, Donations, Friendly, and Talkative clans show a right-skewed distribution, with most clans concentrated at the low end and a thin tail toward higher values, indicating that most clans under these labels have lower to moderate clan points.
  • These likely cater to casual or beginner players.

Sparse or Limited Data Labels:

  • Labels such as Base Designing, Builder Base, Underdog, and International show very limited data, making it harder to determine clear trends. However, the low frequencies suggest these may be niche or less commonly used labels.

Balanced or Broad Spread:

  • Relaxed, Clan Capital, and Trophy Pushing exhibit relatively broad distributions, indicating a mix of both competitive and casual clans under these tags.
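
Visual impressions from histograms are easier to defend when backed by a summary number per label. The sketch below computes the median clanPoints for each label from a list of clan dictionaries. `median_points_by_label` is a hypothetical helper, and the sample values are made-up toy numbers used only to show the mechanics, not results from the dataset.

```python
from statistics import median

def median_points_by_label(clans):
    """Map each label name to the median clanPoints of the clans carrying it."""
    points = {}
    for clan in clans:
        for label in clan.get("labels", []):
            points.setdefault(label["name"], []).append(clan["clanPoints"])
    return {name: median(values) for name, values in points.items()}

# Toy records (made-up values, for illustration only)
toy_clans = [
    {"clanPoints": 10, "labels": [{"name": "A"}]},
    {"clanPoints": 30, "labels": [{"name": "A"}, {"name": "B"}]},
    {"clanPoints": 5, "labels": []},
]
print(median_points_by_label(toy_clans))
```

Applied to clan_data, the result can be sorted to see whether the ordering of labels matches what the histograms suggest.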

Interpretation:¶

Clan labels appear to be associated with different play styles, and this is reflected in their point distributions. Competition-oriented labels like Clan Wars and Clan War League coincide with higher clan points, while social or casual labels such as Friendly, Talkative, and Newbie Friendly tend to show lower clan points. These observations come from visually comparing histograms for a single country, so they describe an association and do not show that a label causes higher or lower performance.

This insight can help in segmenting player behavior, designing targeted clan recruitment, and even informing game balancing or community management strategies.

Instruction: Analytical Exploration of Clash of Clans Data¶

You are tasked with analyzing the Clash of Clans data (using the available API endpoints) for a specific country. Your goal is to uncover meaningful insights that can help understand trends, behavior, and performance within the game environment.

Tasks:¶

  1. Formulate Analytical Questions

    • Develop 3–4 analytical questions related to the available data from the Clash of Clans API (including clans, players, war logs, trophies, and more).
    • Think critically about what insights can be drawn from different parts of the data—whether that’s trends in player performance, clan activity, or comparisons across various metrics.
    • Your questions should guide your analysis and shape the conclusions you aim to draw.
    • Feel free to use AI tools (like ChatGPT, Bard, or Copilot) to help brainstorm questions, structure your analysis, or clarify how different parts of the data relate to one another.
  2. Analyze the Data

    • Use Python with libraries like pandas, seaborn, and matplotlib to analyze the data.

    • Consider questions such as:

      • How do trophies correlate with clan level or war activity?
      • What are the patterns in player behavior (e.g., win rates, participation in wars, etc.)?
      • Are there any notable country-specific trends (e.g., number of clans, trophies, or member distribution)?
      • How does clan size affect performance or participation in events?
  3. Visualize Your Findings

    • Present your findings through visualizations that clearly communicate your insights. These might include:

      • Bar charts for distributions (e.g., clan levels, trophies, war participation)
      • Boxplots for variations (e.g., trophies by clan level or war frequency)
      • Scatter plots to visualize relationships (e.g., trophies vs. clan level, member count vs. activity)
    • Ensure the visuals are clear, informative, and help to support the analysis you conduct.

  4. Present Your Results

    • Prepare a short presentation (3–5 minutes) summarizing:

      • Your analytical questions
      • The methods used for analysis
      • The key findings from your exploration
      • How these insights might inform strategy, performance, or behavior in Clash of Clans.
    • Explain why you selected those questions and how your analysis provides a deeper understanding of the game’s mechanics.

Prepare to share your findings with the class and discuss how your analysis could be applied to improve player or clan strategies in Clash of Clans!
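
Groups that go beyond the clan list and query individual clans should note that clan tags begin with '#', a character that normally starts a URL fragment and would not be sent to the server as part of the path. The sketch below percent-encodes the tag first. It assumes per-clan paths of the form /clans/{clanTag}, optionally followed by a sub-resource such as a war log; confirm the exact paths in the API documentation. `clan_url` is a hypothetical helper.

```python
from urllib.parse import quote

def clan_url(base_url, clan_tag, suffix=""):
    """Build a per-clan URL, percent-encoding the tag (including its leading '#')."""
    encoded = quote(clan_tag, safe="")  # '#' becomes '%23'
    url = f"{base_url.rstrip('/')}/clans/{encoded}"
    return f"{url}/{suffix.strip('/')}" if suffix else url

# Tag taken from the first sample row in the clan table
print(clan_url("https://api.clashofclans.com/v1/", "#2R0JVJJ80"))
```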

Homework: Reflection Paper – Data Science Journey¶

After completing the workshop and the course, write a brief reflection paper (1 page) addressing the following:

  1. Summarize your experience in the data science workshop—what you did, how you approached the task, and what tools or techniques you used.
  2. Reflect on your understanding of data science before and after the course. How has it changed? What concepts or skills stood out the most?
  3. Discuss the real-world value of data science. Based on your experience, where do you see data science being useful (in games, business, research, society, etc.)?
  4. Share your personal insights or challenges. What part of the course or activity did you find most engaging or difficult, and why?
  5. Conclude with your future outlook. How do you see yourself using (or not using) data science in your academic or professional journey?
  6. Submit it in USTEP.